# Self-play optimization

Llama 3 Instruct 8B SPPO Iter3

A large language model produced by the third iteration of Self-Play Preference Optimization (SPPO), built on Meta-Llama-3-8B-Instruct.

Tags: Large Language Model, Transformers, English


© 2025 AIbase